AMENDMENTS TO THE SPECIFICATION 
Page 1, after the title, please insert the following: 

--Related Applications 

This application is a divisional of U.S. Patent Application Serial No. 09/562,916 filed 
May 2, 2000 and entitled Construction of Trainable Semantic Vectors and Clustering, 
Classification, and Searching Using Trainable-Semantic Vectors, which claims priority from 
U.S. Provisional Patent Application No. 60/177,654, filed January 27, 2000, which is 
incorporated herein by reference.--

Please amend page 17, line 7 through line 9 as follows: 

A weighted average of u and v can also be used to determine the significance of 
data points, according to the following formula: 
TSV = a(v) + (1 - a)(u)
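
By way of illustration only, the weighted average can be sketched as follows, assuming the formula reads TSV = a(v) + (1 - a)(u), that u and v are equal-length vectors of per-category values, and that the weight a lies between 0 and 1. The function name weighted_tsv and the range check on a are assumptions made for this sketch and do not appear in the specification.

```python
def weighted_tsv(u, v, a):
    """Combine u and v element by element as a*v + (1 - a)*u.

    u, v: equal-length sequences of per-category values.
    a:    weight applied to v (assumed here to lie in [0, 1]).
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("weight a is assumed to lie in [0, 1]")
    if len(u) != len(v):
        raise ValueError("u and v must have the same number of entries")
    return [a * v_i + (1.0 - a) * u_i for u_i, v_i in zip(u, v)]


# Hypothetical two-category vectors, chosen only to show the arithmetic.
example = weighted_tsv(u=[0.0, 1.0], v=[1.0, 0.0], a=0.25)
```

With a = 0.25, each entry takes one quarter of its value from v and three quarters from u.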

Please amend page 36, line 2 through line 16 as follows: 

Summation of the total number of columns 212 across each row 210 provides the
total number of documents that contain the word represented by the row 210. These values are
represented at column 216. Summation of all the rows 210 across a column 212 provides the
number of documents within the category represented by that column 212. This is shown in
Figure 8 using reference numeral 218. Referring to Figure 8, word W1 appears twenty times in
category Cat2 and eight times in category Cat5. Word W1 does not appear in categories Cat1,
Cat3, and Cat4. Referring to column 216, word W1 appears a total of 28 times across all
categories. In other words, twenty-eight of the documents classified contain word W1.
Examination of an exemplary column 212, such as Cat1, reveals that word W2 appears once in
category Cat1, word W3 appears eight times in category Cat1, and word W5 appears twice in
category Cat1. Word W4 does not appear at all in category Cat1. As previously stated, word W1
does not appear in category 1. Referring to row 218, the entry corresponding to category Cat1
indicates that there are eleven documents classified in category Cat1.
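
By way of illustration only, the row and column summations described for Figure 8 can be sketched as follows. The sketch uses only the counts recited above for word W1 and category Cat1, and it assumes that the table has exactly five categories (Cat1 through Cat5) and five words (W1 through W5). The remaining entries of Figure 8 are not recited in this passage and are not reproduced. The variable and function names are hypothetical.

```python
# Row 210 for word W1: counts in each category, as recited in the text.
w1_row = {"Cat1": 0, "Cat2": 20, "Cat3": 0, "Cat4": 0, "Cat5": 8}

# Column 212 for category Cat1: counts for each word, as recited in the text.
cat1_column = {"W1": 0, "W2": 1, "W3": 8, "W4": 0, "W5": 2}


def total(counts):
    """Sum the counts along one row or one column of the table."""
    return sum(counts.values())


w1_total = total(w1_row)         # corresponds to the W1 entry in column 216
cat1_total = total(cat1_column)  # corresponds to the Cat1 entry in row 218
```

Here w1_total evaluates to 28 and cat1_total to 11, matching the totals stated for column 216 and row 218.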

Please amend page 36, line 17 through page 37, line 4 as follows: 

With continued reference to Figure 8, Figure 9 illustrates a table 230 that stores the values 
that indicate the relative strength of each word with respect to the categories. Specifically, the 
percentage of data points occurring in each category (i.e., u) is presented in the form of a vector 
for each word. The value for each entry in the u vector is calculated according to the following 
formula: 

u = Prob(entry | category) = (entry_n, category_m) / category_m_total

Table 230 also presents the probability distribution of a data point's occurrence across all 
categories (i.e., v) in the form of a vector for each word. The value for each entry in the v vector 
is calculated according to the following formula: 

v = Prob(category | entry) = (entry_n, category_m) / entry_n_total
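
By way of illustration only, the two formulas can be sketched as follows, assuming that (entry_n, category_m) denotes the count stored for word n in category m, that category_m_total is the column total for category m, and that entry_n_total is the row total for word n. The function names and the handling of a zero total are assumptions of this sketch. The sample values are those recited for Figure 8 (W3 appears eight times in Cat1, which totals eleven; W1 appears twenty times in Cat2 and eight times in Cat5, for a total of twenty-eight).

```python
from fractions import Fraction


def u_entry(count, category_total):
    """u = Prob(entry | category): count for the word in the category,
    divided by the category (column) total."""
    if category_total == 0:
        return Fraction(0)  # assumption: an empty category contributes zero
    return Fraction(count, category_total)


def v_entry(count, entry_total):
    """v = Prob(category | entry): count for the word in the category,
    divided by the word (row) total."""
    if entry_total == 0:
        return Fraction(0)  # assumption: an unseen word contributes zero
    return Fraction(count, entry_total)


u_w3_cat1 = u_entry(8, 11)   # W3 within Cat1
v_w1_cat2 = v_entry(20, 28)  # Cat2 given W1
v_w1_cat5 = v_entry(8, 28)   # Cat5 given W1
```

Exact fractions are used so that the v entries for a word can be checked to sum to one across the categories in which the word appears.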